Overview


Dataset statistics

Number of variables: 20
Number of observations: 50000
Missing cells: 50134
Missing cells (%): 5.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.7 MiB
Average record size in memory: 707.0 B
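
The missing-cell percentage follows from the counts above: missing cells divided by the total number of cells (variables times observations). A minimal Python check using only the figures reported in this table; the function name is illustrative and not part of the report:

```python
def missing_cell_percentage(missing_cells, n_variables, n_observations):
    """Share of missing cells in the whole table, as a percentage."""
    total_cells = n_variables * n_observations
    return 100 * missing_cells / total_cells

# Figures copied from the dataset statistics above.
pct = missing_cell_percentage(missing_cells=50_134, n_variables=20, n_observations=50_000)
print(f"{pct:.1f}%")
```

The unrounded value is slightly above 5%, which the report displays to one decimal place.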

Variable types

Text: 4
Categorical: 5
Numeric: 11

Alerts

Aromaticity is highly overall correlated with Oxidized_coefficient and 1 other field (High correlation)
Function_Prediction_source is highly overall correlated with Protein_source (High correlation)
Function_prediction_source is highly overall correlated with Phage_source and 1 other field (High correlation)
Molecular_weight is highly overall correlated with Oxidized_coefficient and 1 other field (High correlation)
Oxidized_coefficient is highly overall correlated with Aromaticity and 2 other fields (High correlation)
Phage_source is highly overall correlated with Function_prediction_source and 1 other field (High correlation)
Protein_source is highly overall correlated with Function_Prediction_source and 2 other fields (High correlation)
Reduced_coefficient is highly overall correlated with Aromaticity and 2 other fields (High correlation)
Start is highly overall correlated with Stop (High correlation)
Stop is highly overall correlated with Start (High correlation)
Protein_source is highly imbalanced (93.8%) (Imbalance)
Function_prediction_source has 22743 (45.5%) missing values (Missing)
Function_Prediction_source has 27257 (54.5%) missing values (Missing)
Protein_ID has unique values (Unique)
Aromaticity has 8299 (16.6%) zeros (Zeros)
Instability_index has 827 (1.7%) zeros (Zeros)
Helix_fraction has 2146 (4.3%) zeros (Zeros)
Turn_fraction has 3063 (6.1%) zeros (Zeros)
Sheet_fraction has 2487 (5.0%) zeros (Zeros)
Reduced_coefficient has 13787 (27.6%) zeros (Zeros)
Oxidized_coefficient has 13261 (26.5%) zeros (Zeros)
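
The 93.8% imbalance figure for Protein_source is consistent with an entropy-based score: one minus the Shannon entropy of the class shares divided by the maximum entropy for that number of classes. The sketch below assumes that definition (the report itself does not state it) and uses the Protein_source value counts listed under that variable; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    H is the Shannon entropy (base 2) of the class shares and H_max is
    log2(number of classes). Scores near 1 mean one class dominates.
    This definition is an assumption, not taken from the report.
    """
    n_classes = len(counts)
    if n_classes < 2:
        # Normalisation is undefined for a single class; return 0 by convention.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Protein_source counts: prodigal, RefSeq, Genbank, DDBJ, EMBL.
protein_source_counts = [49142, 567, 256, 22, 13]
print(f"{100 * imbalance_score(protein_source_counts):.1f}%")
```

A perfectly balanced variable scores 0 under this definition, so a value this close to 1 reflects that almost every row is `prodigal`.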

Reproduction

Analysis started: 2025-07-15 20:28:51.795512
Analysis finished: 2025-07-15 20:29:05.993714
Duration: 14.2 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
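
The reported duration is the difference between the two timestamps above. A small standard-library check; the format string is an assumption about how the timestamps are laid out, based on how they appear here:

```python
from datetime import datetime

# Timestamp layout as it appears in the Reproduction section (assumed format).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2025-07-15 20:28:51.795512", TIMESTAMP_FORMAT)
finished = datetime.strptime("2025-07-15 20:29:05.993714", TIMESTAMP_FORMAT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```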

Variables

Distinct: 47819
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.4 MiB

Length

Max length: 87
Median length: 85
Mean length: 34.49438
Min length: 5

Characters and Unicode

Total characters: 1724719
Distinct characters: 66
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45753
Unique (%): 91.5%

Sample

1st row: NC_001416.1
2nd row: NC_001629.1
3rd row: NC_001825.1
4th row: NC_001902.1
5th row: NC_001271.1

Value  Count  Frequency (%)
nc_048047.1  6  < 0.1%
mgv-genome-0357750  4  < 0.1%
nc_042047.1  4  < 0.1%
imgvr_uvig_3300045988_177519|3300045988|ga0495776_003599  4  < 0.1%
uvig_458124  4  < 0.1%
mgv-genome-0379300  4  < 0.1%
uvig_25220  4  < 0.1%
mgv-genome-0376919  4  < 0.1%
mgv-genome-0376593  4  < 0.1%
imgvr_uvig_3300007222_000012|3300007222|ga0104061_100089  3  < 0.1%
Other values (47809)  49959  99.9%

Most occurring characters

Value  Count  Frequency (%)
0  189862  11.0%
_  137375  8.0%
3  106332  6.2%
1  90559  5.3%
2  84027  4.9%
8  82093  4.8%
5  79942  4.6%
4  78529  4.6%
9  73036  4.2%
7  70932  4.1%
Other values (56)  732032  42.4%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  1724719  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  1724719  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  1724719  100.0%


Protein_source
Categorical

High correlation  Imbalance 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 MiB

Value  Count
prodigal  49142
RefSeq  567
Genbank  256
DDBJ  22
EMBL  13

Length

Max length: 8
Median length: 8
Mean length: 7.9694
Min length: 4

Characters and Unicode

Total characters: 398470
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RefSeq
2nd row: RefSeq
3rd row: RefSeq
4th row: RefSeq
5th row: RefSeq

Common Values

Value  Count  Frequency (%)
prodigal  49142  98.3%
RefSeq  567  1.1%
Genbank  256  0.5%
DDBJ  22  < 0.1%
EMBL  13  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
prodigal  49142  98.3%
refseq  567  1.1%
genbank  256  0.5%
ddbj  22  < 0.1%
embl  13  < 0.1%

Most occurring characters

Value  Count  Frequency (%)
a  49398  12.4%
r  49142  12.3%
p  49142  12.3%
o  49142  12.3%
d  49142  12.3%
i  49142  12.3%
g  49142  12.3%
l  49142  12.3%
e  1390  0.3%
R  567  0.1%
Other values (13)  3121  0.8%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  398470  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  398470  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  398470  100.0%


Function_prediction_source
Categorical

High correlation  Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 22743
Missing (%): 45.5%
Memory size: 3.0 MiB

Value  Count
eggNOG-mapper  10996
Iterative search  9884
-  5519
RefSeq  567
Genbank  256
Other values (2)  35

Length

Max length: 16
Median length: 13
Mean length: 11.444583
Min length: 1

Characters and Unicode

Total characters: 311945
Distinct characters: 31
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RefSeq
2nd row: RefSeq
3rd row: RefSeq
4th row: RefSeq
5th row: RefSeq

Common Values

Value  Count  Frequency (%)
eggNOG-mapper  10996  22.0%
Iterative search  9884  19.8%
-  5519  11.0%
RefSeq  567  1.1%
Genbank  256  0.5%
DDBJ  22  < 0.1%
EMBL  13  < 0.1%
(Missing)  22743  45.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
eggnog-mapper  10996  29.6%
iterative  9884  26.6%
search  9884  26.6%
(blank)  5519  14.9%
refseq  567  1.5%
genbank  256  0.7%
ddbj  22  0.1%
embl  13  < 0.1%
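
The percentages in the last table differ from those under Common Values because its rows look like word tokens rather than whole values: "Iterative search" appears as two rows, and shares are taken over all tokens instead of over the 50,000 observations. The sketch below reproduces those shares under the assumption that values are lower-cased, split on whitespace, and stripped of leading and trailing punctuation (so "-" becomes an empty token). The exact tokenization used by the profiler is not stated in this report, and `token_counts` is a hypothetical helper.

```python
import string
from collections import Counter

# Non-missing value counts from the Common Values table above.
value_counts = {
    "eggNOG-mapper": 10996,
    "Iterative search": 9884,
    "-": 5519,
    "RefSeq": 567,
    "Genbank": 256,
    "DDBJ": 22,
    "EMBL": 13,
}

def token_counts(counts):
    """Count lower-cased, punctuation-stripped tokens across all values."""
    tokens = Counter()
    for value, n in counts.items():
        for word in value.lower().split():
            # Strip punctuation only at the ends, so "eggnog-mapper" stays whole.
            tokens[word.strip(string.punctuation)] += n
    return tokens

tokens = token_counts(value_counts)
total_tokens = sum(tokens.values())
for token, n in tokens.most_common():
    print(f"{token!r}: {n} ({100 * n / total_tokens:.1f}%)")
```

Under these assumptions the denominator is the total token count, which is why `eggnog-mapper` rises from 22.0% of observations to 29.6% of tokens.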

Most occurring characters

Value  Count  Frequency (%)
e  53034  17.0%
a  31020  9.9%
r  30764  9.9%
g  21992  7.0%
p  21992  7.0%
t  19768  6.3%
-  16515  5.3%
G  11252  3.6%
m  10996  3.5%
N  10996  3.5%
Other values (21)  83616  26.8%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  311945  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  311945  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  311945  100.0%


Start
Real number (ℝ)

High correlation 

Distinct: 34163
Distinct (%): 68.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29004.409
Minimum: 1
Maximum: 475561
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1417.95
Q1: 8983
Median: 20791.5
Q3: 37135.5
95-th percentile: 87222.05
Maximum: 475561
Range: 475560
Interquartile range (IQR): 28152.5

Descriptive statistics

Standard deviation: 31420.546
Coefficient of variation (CV): 1.0833024
Kurtosis: 15.777577
Mean: 29004.409
Median Absolute Deviation (MAD): 13356.5
Skewness: 3.0413867
Sum: 1.4502204 × 10^9
Variance: 9.872507 × 10^8
Monotonicity: Not monotonic
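
Several of the Start statistics are derived from others in the same block: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check using the rounded figures reported above, so small rounding differences are expected:

```python
# Values copied from the Start statistics above (already rounded by the report).
minimum, maximum = 1, 475561
q1, q3 = 8983.0, 37135.5
mean, std = 29004.409, 31420.546

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance as the squared standard deviation

print(value_range, iqr, round(cv, 7), f"{variance:.6e}")
```

The same relations hold for the other numeric variables in this report.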

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
1  245  0.5%
2  175  0.4%
3  161  0.3%
17781  9  < 0.1%
50  8  < 0.1%
10213  7  < 0.1%
1272  7  < 0.1%
868  7  < 0.1%
3446  7  < 0.1%
26335  7  < 0.1%
Other values (34153)  49367  98.7%

Minimum 10 values

Value  Count  Frequency (%)
1  245  0.5%
2  175  0.4%
3  161  0.3%
5  2  < 0.1%
6  2  < 0.1%
7  1  < 0.1%
8  1  < 0.1%
12  1  < 0.1%
13  1  < 0.1%
14  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
475561  1  < 0.1%
437464  1  < 0.1%
423051  1  < 0.1%
380045  1  < 0.1%
375840  1  < 0.1%
374983  1  < 0.1%
360500  1  < 0.1%
360257  1  < 0.1%
357355  1  < 0.1%
356721  1  < 0.1%

Stop
Real number (ℝ)

High correlation 

Distinct: 34455
Distinct (%): 68.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29693.313
Minimum: 60
Maximum: 475914
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 60
5-th percentile: 2049.95
Q1: 9712
Median: 21529
Q3: 37815
95-th percentile: 87792.65
Maximum: 475914
Range: 475854
Interquartile range (IQR): 28103

Descriptive statistics

Standard deviation: 31426.256
Coefficient of variation (CV): 1.0583614
Kurtosis: 15.793947
Mean: 29693.313
Median Absolute Deviation (MAD): 13347
Skewness: 3.0431639
Sum: 1.4846656 × 10^9
Variance: 9.8760958 × 10^8
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
3690  7  < 0.1%
25677  7  < 0.1%
9299  7  < 0.1%
2020  7  < 0.1%
9035  7  < 0.1%
2488  7  < 0.1%
4558  7  < 0.1%
12925  7  < 0.1%
2502  6  < 0.1%
7254  6  < 0.1%
Other values (34445)  49932  99.9%

Minimum 10 values

Value  Count  Frequency (%)
60  1  < 0.1%
66  3  < 0.1%
69  3  < 0.1%
70  1  < 0.1%
71  1  < 0.1%
72  2  < 0.1%
73  1  < 0.1%
75  2  < 0.1%
76  1  < 0.1%
78  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
475914  1  < 0.1%
437634  1  < 0.1%
423518  1  < 0.1%
380653  1  < 0.1%
377492  1  < 0.1%
377142  1  < 0.1%
361126  1  < 0.1%
360385  1  < 0.1%
358218  1  < 0.1%
357122  1  < 0.1%

Strand
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value  Count
+  25135
-  24865

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: +
2nd row: +
3rd row: +
4th row: +
5th row: +

Common Values

Value  Count  Frequency (%)
+  25135  50.3%
-  24865  49.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
(blank)  50000  100.0%

Most occurring characters

Value  Count  Frequency (%)
+  25135  50.3%
-  24865  49.7%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  50000  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  50000  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  50000  100.0%


Protein_ID
Text

Unique 

Distinct: 50000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 90
Median length: 86
Mean length: 37.3604
Min length: 8

Characters and Unicode

Total characters: 1868020
Distinct characters: 66
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50000
Unique (%): 100.0%

Sample

1st row: NP_040636.1
2nd row: NP_042327.1
3rd row: NP_044830.1
4th row: NP_046963.1
5th row: NP_052068.1

Value  Count  Frequency (%)
yp_239040.1  1  < 0.1%
biochar_5611_7  1  < 0.1%
np_040636.1  1  < 0.1%
np_042327.1  1  < 0.1%
np_044830.1  1  < 0.1%
np_046963.1  1  < 0.1%
np_052068.1  1  < 0.1%
np_061623.1  1  < 0.1%
np_073699.1  1  < 0.1%
np_463475.1  1  < 0.1%
Other values (49990)  49990  > 99.9%

Most occurring characters

Value  Count  Frequency (%)
0  195699  10.5%
_  186517  10.0%
3  117947  6.3%
1  108031  5.8%
2  97342  5.2%
5  88737  4.8%
4  88675  4.7%
8  88258  4.7%
9  78991  4.2%
7  77675  4.2%
Other values (56)  740148  39.6%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  1868020  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  1868020  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  1868020  100.0%


Distinct: 4015
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Length

Max length: 1357
Median length: 769
Mean length: 26.1082
Min length: 2

Characters and Unicode

Total characters: 1305410
Distinct characters: 80
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1764
Unique (%): 3.5%

Sample

1st row: NinD protein
2nd row: DNA polymerase
3rd row: hypothetical protein
4th row: terminase small subunit
5th row: putative 0.6A protein

Value  Count  Frequency (%)
unknown  19746  11.7%
protein  12889  7.6%
of  4841  2.9%
hypothetical  4341  2.6%
the  4085  2.4%
domain  3773  2.2%
phage  3275  1.9%
family  2919  1.7%
dna  2885  1.7%
to  2200  1.3%
Other values (5428)  108085  63.9%

Most occurring characters

Value  Count  Frequency (%)
n  129006  9.9%
(space)  119054  9.1%
e  101038  7.7%
o  99410  7.6%
i  88632  6.8%
t  83003  6.4%
a  75631  5.8%
r  57145  4.4%
s  49094  3.8%
l  47479  3.6%
Other values (70)  455918  34.9%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  1305410  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  1305410  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  1305410  100.0%


Distinct: 67
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 MiB

Length

Max length: 63
Median length: 9
Mean length: 10.48208
Min length: 6

Characters and Unicode

Total characters: 524104
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): < 0.1%

Sample

1st row: unsorted;
2nd row: replication;
3rd row: hypothetical;
4th row: packaging;
5th row: unsorted;

Value  Count  Frequency (%)
unsorted  27529  55.1%
hypothetical  4342  8.7%
assembly  3576  7.2%
replication  2475  5.0%
infection  2034  4.1%
packaging  1723  3.4%
assembly;infection  1604  3.2%
lysis  1400  2.8%
integration  1230  2.5%
regulation  1060  2.1%
Other values (57)  3027  6.1%

Most occurring characters

Value  Count  Frequency (%)
;  54024  10.3%
e  50355  9.6%
t  49825  9.5%
n  47552  9.1%
o  43497  8.3%
s  42397  8.1%
r  35443  6.8%
u  30588  5.8%
i  30103  5.7%
d  27726  5.3%
Other values (15)  112594  21.5%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  524104  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  524104  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  524104  100.0%


Molecular_weight
Real number (ℝ)

High correlation 

Distinct: 44387
Distinct (%): 88.9%
Missing: 67
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4125.5155
Minimum: 75.0666
Maximum: 8913.4407
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 75.0666
5-th percentile: 424.4482
Q1: 2023.2453
Median: 4210.6971
Q3: 6223.9149
95-th percentile: 7677.612
Maximum: 8913.4407
Range: 8838.3741
Interquartile range (IQR): 4200.6696

Descriptive statistics

Standard deviation: 2368.6116
Coefficient of variation (CV): 0.57413711
Kurtosis: -1.2447258
Mean: 4125.5155
Median Absolute Deviation (MAD): 2096.3227
Skewness: -0.044836548
Sum: 2.0599937 × 10^8
Variance: 5610320.7
Monotonicity: Not monotonic
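
Because 67 values are missing, the reported Sum appears to be taken over the non-missing values only: multiplying the Mean by the non-missing count reproduces it to the displayed precision, while multiplying by all 50,000 observations does not. A quick check with the rounded figures from this report (variable names are illustrative):

```python
# Figures copied from the report; the mean is already rounded.
n_observations = 50_000
n_missing = 67
mean = 4125.5155
reported_sum = 2.0599937e8

# Number of values that actually contribute to the mean and the sum.
n_valid = n_observations - n_missing

sum_over_valid = mean * n_valid        # close to the reported Sum
sum_over_all = mean * n_observations   # noticeably larger

print(n_valid, f"{sum_over_valid:.7e}", f"{sum_over_all:.7e}")
```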

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
146.1876  99  0.2%
131.1729  95  0.2%
147.1293  91  0.2%
89.0932  67  0.1%
174.201  60  0.1%
117.1463  58  0.1%
105.0926  58  0.1%
133.1027  53  0.1%
75.0666  45  0.1%
146.1445  41  0.1%
Other values (44377)  49266  98.5%
(Missing)  67  0.1%

Minimum 10 values

Value  Count  Frequency (%)
75.0666  45  0.1%
89.0932  67  0.1%
105.0926  58  0.1%
115.1305  13  < 0.1%
117.1463  58  0.1%
119.1192  19  < 0.1%
121.1582  7  < 0.1%
131.1729  95  0.2%
132.1179  29  0.1%
133.1027  53  0.1%

Maximum 10 values

Value  Count  Frequency (%)
8913.4407  1  < 0.1%
8906.7809  1  < 0.1%
8766.9951  1  < 0.1%
8745.5145  1  < 0.1%
8722.928  1  < 0.1%
8711.0953  1  < 0.1%
8702.5438  1  < 0.1%
8696.8709  1  < 0.1%
8676.686  1  < 0.1%
8671.135  1  < 0.1%

Aromaticity
Real number (ℝ)

High correlation  Zeros 

Distinct: 470
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.090285378
Minimum: 0
Maximum: 1
Zeros: 8299
Zeros (%): 16.6%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.041666667
Median: 0.083333333
Q3: 0.125
95-th percentile: 0.2
Maximum: 1
Range: 1
Interquartile range (IQR): 0.083333333

Descriptive statistics

Standard deviation: 0.080309898
Coefficient of variation (CV): 0.88951167
Kurtosis: 30.458213
Mean: 0.090285378
Median Absolute Deviation (MAD): 0.041666667
Skewness: 3.5284955
Sum: 4514.2689
Variance: 0.0064496797
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  8299  16.6%
0.1428571429  1047  2.1%
0.1  1011  2.0%
0.125  999  2.0%
0.1111111111  973  1.9%
0.09090909091  958  1.9%
0.1666666667  826  1.7%
0.07692307692  800  1.6%
0.08333333333  782  1.6%
0.07142857143  729  1.5%
Other values (460)  33576  67.2%

Minimum 10 values

Value  Count  Frequency (%)
0  8299  16.6%
0.01428571429  20  < 0.1%
0.01449275362  17  < 0.1%
0.01470588235  18  < 0.1%
0.01492537313  16  < 0.1%
0.01515151515  22  < 0.1%
0.01538461538  28  0.1%
0.015625  19  < 0.1%
0.01587301587  22  < 0.1%
0.01612903226  24  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
1  84  0.2%
0.6666666667  18  < 0.1%
0.6  8  < 0.1%
0.6  2  < 0.1%
0.5526315789  1  < 0.1%
0.5  178  0.4%
0.4666666667  3  < 0.1%
0.4615384615  1  < 0.1%
0.4545454545  1  < 0.1%
0.4444444444  6  < 0.1%

Instability_index
Real number (ℝ)

Zeros 

Distinct: 39095
Distinct (%): 78.3%
Missing: 67
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.583318
Minimum: -86.5
Maximum: 388.53333
Zeros: 827
Zeros (%): 1.7%
Negative: 3310
Negative (%): 6.6%
Memory size: 390.8 KiB

Quantile statistics

Minimum: -86.5
5-th percentile: -4.221746
Q1: 17.705405
Median: 33.626087
Q3: 50.304
95-th percentile: 82.6625
Maximum: 388.53333
Range: 475.03333
Interquartile range (IQR): 32.598595

Descriptive statistics

Standard deviation: 29.020131
Coefficient of variation (CV): 0.81555438
Kurtosis: 5.0529155
Mean: 35.583318
Median Absolute Deviation (MAD): 16.294658
Skewness: 1.0631055
Sum: 1776781.8
Variance: 842.16798
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  827  1.7%
5  520  1.0%
6.666666667  319  0.6%
7.5  211  0.4%
8  134  0.3%
-13.725  95  0.2%
8.333333333  94  0.2%
-21.63333333  91  0.2%
55.65  79  0.2%
-37.45  78  0.2%
Other values (39085)  47485  95.0%

Minimum 10 values

Value  Count  Frequency (%)
-86.5  1  < 0.1%
-77.225  1  < 0.1%
-74.83333333  2  < 0.1%
-73  1  < 0.1%
-72.525  3  < 0.1%
-71.73333333  1  < 0.1%
-70.15  1  < 0.1%
-70.15  22  < 0.1%
-69.1  3  < 0.1%
-68.56666667  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
388.5333333  1  < 0.1%
291.4  4  < 0.1%
272.675  1  < 0.1%
263.4666667  1  < 0.1%
261.8  4  < 0.1%
260.8  1  < 0.1%
249.3166667  1  < 0.1%
247.625  1  < 0.1%
243.4571429  1  < 0.1%
241.5111111  1  < 0.1%

Isoelectric_point
Real number (ℝ)

Distinct: 19764
Distinct (%): 39.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.8159599
Minimum: 4.0500284
Maximum: 11.999968
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 4.0500284
5-th percentile: 4.0500284
Q1: 4.6234201
Median: 6.0697451
Q3: 9.1382807
95-th percentile: 10.61255
Maximum: 11.999968
Range: 7.9499393
Interquartile range (IQR): 4.5148605

Descriptive statistics

Standard deviation: 2.3580697
Coefficient of variation (CV): 0.34596297
Kurtosis: -1.2739514
Mean: 6.8159599
Median Absolute Deviation (MAD): 1.9132004
Skewness: 0.41196447
Sum: 340798
Variance: 5.5604928
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
4.050028419  4589  9.2%
5.525000191  804  1.6%
11.99996777  521  1.0%
8.750052071  409  0.8%
9.750021172  261  0.5%
5.57001667  161  0.3%
5.240009499  148  0.3%
5.494989204  148  0.3%
11.00083675  146  0.3%
10.00273724  141  0.3%
Other values (19754)  42672  85.3%

Minimum 10 values

Value  Count  Frequency (%)
4.050028419  4589  9.2%
4.05110836  1  < 0.1%
4.051335716  1  < 0.1%
4.051790428  1  < 0.1%
4.052586174  2  < 0.1%
4.052699852  1  < 0.1%
4.052984047  1  < 0.1%
4.053040886  1  < 0.1%
4.053097725  1  < 0.1%
4.053211403  4  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
11.99996777  521  1.0%
11.92157421  1  < 0.1%
11.91712589  1  < 0.1%
11.91532078  1  < 0.1%
11.91480503  2  < 0.1%
11.91254864  2  < 0.1%
11.91196842  1  < 0.1%
11.91042118  1  < 0.1%
11.90810032  1  < 0.1%
11.90552158  1  < 0.1%

Helix_fraction
Real number (ℝ)

Zeros 

Distinct: 1165
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2954387
Minimum: 0
Maximum: 1
Zeros: 2146
Zeros (%): 4.3%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.083333333
Q1: 0.23529412
Median: 0.296875
Q3: 0.35294118
95-th percentile: 0.5
Maximum: 1
Range: 1
Interquartile range (IQR): 0.11764706

Descriptive statistics

Standard deviation: 0.12606572
Coefficient of variation (CV): 0.42670688
Kurtosis: 6.2092713
Mean: 0.2954387
Median Absolute Deviation (MAD): 0.058680556
Skewness: 0.9435565
Sum: 14771.935
Variance: 0.015892567
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0.3333333333  2507  5.0%
0  2146  4.3%
0.25  1566  3.1%
0.2857142857  1141  2.3%
0.5  1029  2.1%
0.2  918  1.8%
0.3  786  1.6%
0.4  727  1.5%
0.375  620  1.2%
0.2727272727  608  1.2%
Other values (1155)  37952  75.9%

Minimum 10 values

Value  Count  Frequency (%)
0  2146  4.3%
0.01923076923  2  < 0.1%
0.02173913043  2  < 0.1%
0.02222222222  5  < 0.1%
0.02272727273  5  < 0.1%
0.02380952381  1  < 0.1%
0.025  1  < 0.1%
0.02564102564  3  < 0.1%
0.02631578947  2  < 0.1%
0.02777777778  3  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
1  318  0.6%
0.875  2  < 0.1%
0.8571428571  2  < 0.1%
0.8571428571  4  < 0.1%
0.8421052632  1  < 0.1%
0.8333333333  1  < 0.1%
0.8333333333  2  < 0.1%
0.8  11  < 0.1%
0.7826086957  2  < 0.1%
0.75  45  0.1%

Turn_fraction
Real number (ℝ)

Zeros 

Distinct: 888
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.20626721
Minimum: 0
Maximum: 1
Zeros: 3063
Zeros (%): 6.1%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.14285714
Median: 0.2
Q3: 0.25641026
95-th percentile: 0.38461538
Maximum: 1
Range: 1
Interquartile range (IQR): 0.11355311

Descriptive statistics

Standard deviation: 0.11427894
Coefficient of variation (CV): 0.55403347
Kurtosis: 9.0603218
Mean: 0.20626721
Median Absolute Deviation (MAD): 0.057142857
Skewness: 1.6850346
Sum: 10313.361
Variance: 0.013059676
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  3063  6.1%
0.25  1587  3.2%
0.2  1535  3.1%
0.1666666667  1380  2.8%
0.3333333333  1104  2.2%
0.1428571429  1043  2.1%
0.2222222222  758  1.5%
0.1818181818  689  1.4%
0.2857142857  684  1.4%
0.125  672  1.3%
Other values (878)  37485  75.0%

Minimum 10 values

Value  Count  Frequency (%)
0  3063  6.1%
0.01886792453  2  < 0.1%
0.01923076923  2  < 0.1%
0.01960784314  1  < 0.1%
0.02040816327  1  < 0.1%
0.02083333333  1  < 0.1%
0.02127659574  1  < 0.1%
0.02173913043  1  < 0.1%
0.02272727273  2  < 0.1%
0.02380952381  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
1  180  0.4%
0.9090909091  1  < 0.1%
0.8888888889  1  < 0.1%
0.8484848485  1  < 0.1%
0.8333333333  3  < 0.1%
0.8  7  < 0.1%
0.75  35  0.1%
0.7407407407  1  < 0.1%
0.7272727273  2  < 0.1%
0.724137931  2  < 0.1%

Sheet_fraction
Real number (ℝ)

Zeros 

Distinct: 994
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.25590281
Minimum: 0
Maximum: 1
Zeros: 2487
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.031202652
Q1: 0.18634214
Median: 0.25
Q3: 0.31818182
95-th percentile: 0.45
Maximum: 1
Range: 1
Interquartile range (IQR): 0.13183968

Descriptive statistics

Standard deviation: 0.12700454
Coefficient of variation (CV): 0.49629989
Kurtosis: 6.2659779
Mean: 0.25590281
Median Absolute Deviation (MAD): 0.065789474
Skewness: 1.2451369
Sum: 12795.14
Variance: 0.016130152
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  2487  5.0%
0.3333333333  1849  3.7%
0.25  1749  3.5%
0.2  1257  2.5%
0.2857142857  871  1.7%
0.5  868  1.7%
0.1666666667  851  1.7%
0.2222222222  728  1.5%
0.1428571429  643  1.3%
0.3  570  1.1%
Other values (984)  38127  76.3%

Minimum 10 values

Value  Count  Frequency (%)
0  2487  5.0%
0.01886792453  1  < 0.1%
0.02222222222  1  < 0.1%
0.02380952381  1  < 0.1%
0.0243902439  1  < 0.1%
0.02702702703  1  < 0.1%
0.02777777778  3  < 0.1%
0.02857142857  2  < 0.1%
0.02941176471  1  < 0.1%
0.0303030303  2  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
1  271  0.5%
0.8571428571  2  < 0.1%
0.8571428571  1  < 0.1%
0.8333333333  2  < 0.1%
0.8  20  < 0.1%
0.7857142857  1  < 0.1%
0.7777777778  1  < 0.1%
0.7692307692  1  < 0.1%
0.75  59  0.1%
0.7272727273  1  < 0.1%

Reduced_coefficient
Real number (ℝ)

High correlation  Zeros 

Distinct71
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4941.6168
Minimum0
Maximum60500
Zeros13787
Zeros (%)27.6%
Negative0
Negative (%)0.0%
Memory size390.8 KiB
2025-07-15T22:29:08.750852image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Quantile statistics

Minimum0
5-th percentile0
Q10
median2980
Q37450
95-th percentile15470
Maximum60500
Range60500
Interquartile range (IQR)7450

Descriptive statistics

Standard deviation5502.9335
Coefficient of variation (CV)1.1135897
Kurtosis2.6408864
Mean4941.6168
Median Absolute Deviation (MAD)2980
Skewness1.4943833
Sum2.4708084 × 108
Variance30282277
MonotonicityNot monotonic
2025-07-15T22:29:08.816514image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 13787 | 27.6%
1490 | 8254 | 16.5%
2980 | 5001 | 10.0%
6990 | 3030 | 6.1%
4470 | 2881 | 5.8%
5500 | 2844 | 5.7%
8480 | 2521 | 5.0%
9970 | 1594 | 3.2%
5960 | 1531 | 3.1%
12490 | 1027 | 2.1%
Other values (61) | 7530 | 15.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 13787 | 27.6%
1490 | 8254 | 16.5%
2980 | 5001 | 10.0%
4470 | 2881 | 5.8%
5500 | 2844 | 5.7%
5960 | 1531 | 3.1%
6990 | 3030 | 6.1%
7450 | 688 | 1.4%
8480 | 2521 | 5.0%
8940 | 311 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
60500 | 1 | < 0.1%
45490 | 1 | < 0.1%
44000 | 1 | < 0.1%
39990 | 2 | < 0.1%
38500 | 2 | < 0.1%
37470 | 1 | < 0.1%
35980 | 2 | < 0.1%
34490 | 8 | < 0.1%
33920 | 1 | < 0.1%
33460 | 4 | < 0.1%

Oxidized_coefficient
Real number (ℝ)

High correlation  Zeros 

Distinct: 218
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4959.5943
Minimum: 0
Maximum: 60625
Zeros: 13261
Zeros (%): 26.5%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2980
Q3: 7450
95-th percentile: 15720
Maximum: 60625
Range: 60625
Interquartile range (IQR): 7450

Descriptive statistics

Standard deviation: 5512.5224
Coefficient of variation (CV): 1.1114866
Kurtosis: 2.6293851
Mean: 4959.5943
Median Absolute Deviation (MAD): 2980
Skewness: 1.4917394
Sum: 2.4797972 × 10^8
Variance: 30387903
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 13261 | 26.5%
1490 | 7606 | 15.2%
2980 | 4343 | 8.7%
6990 | 2640 | 5.3%
5500 | 2608 | 5.2%
4470 | 2470 | 4.9%
8480 | 2110 | 4.2%
9970 | 1318 | 2.6%
5960 | 1227 | 2.5%
12490 | 845 | 1.7%
Other values (208) | 11572 | 23.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 13261 | 26.5%
125 | 438 | 0.9%
250 | 75 | 0.1%
375 | 10 | < 0.1%
500 | 2 | < 0.1%
750 | 1 | < 0.1%
1490 | 7606 | 15.2%
1615 | 522 | 1.0%
1740 | 107 | 0.2%
1865 | 18 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
60625 | 1 | < 0.1%
45490 | 1 | < 0.1%
44000 | 1 | < 0.1%
40365 | 1 | < 0.1%
40115 | 1 | < 0.1%
38625 | 1 | < 0.1%
38500 | 1 | < 0.1%
37470 | 1 | < 0.1%
36105 | 1 | < 0.1%
35980 | 1 | < 0.1%

Phage_source
Categorical

High correlation 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 MiB

Value | Count
IMG_VR | 14007
MGV | 12240
GPD | 8797
GOV2 | 5915
TemPhD | 4087
Other values (9) | 4954

Length

Max length: 8
Median length: 7
Mean length: 4.35226
Min length: 3

Characters and Unicode

Total characters: 217613
Distinct characters: 29
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
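
To make the idea of per-code-point properties concrete, here is a minimal sketch using Python's standard `unicodedata` module to tally the general category of each character in a few of the category labels from this column. This is an illustration of the concept only; the report itself lists a single "(unknown)" category, so the breakdown below is not something the report computed. The helper name `char_categories` is hypothetical.

```python
import unicodedata
from collections import Counter

def char_categories(text):
    """Tally Unicode general categories (e.g. 'Lu', 'Ll', 'Nd', 'Pc') in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# A few values taken from the Phage_source column.
for value in ["IMG_VR", "GOV2", "TemPhD"]:
    print(value, dict(char_categories(value)))
```

Here `Lu` and `Ll` are uppercase and lowercase letters, `Nd` is a decimal digit, and `Pc` is connector punctuation such as the underscore.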

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RefSeq
2nd row: RefSeq
3rd row: RefSeq
4th row: RefSeq
5th row: RefSeq

Common Values

Value | Count | Frequency (%)
IMG_VR | 14007 | 28.0%
MGV | 12240 | 24.5%
GPD | 8797 | 17.6%
GOV2 | 5915 | 11.8%
TemPhD | 4087 | 8.2%
CHVD | 2250 | 4.5%
GVD | 871 | 1.7%
RefSeq | 567 | 1.1%
PhagesDB | 404 | 0.8%
IGVD | 386 | 0.8%
Other values (4) | 476 | 1.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
img_vr | 14007 | 28.0%
mgv | 12240 | 24.5%
gpd | 8797 | 17.6%
gov2 | 5915 | 11.8%
temphd | 4087 | 8.2%
chvd | 2250 | 4.5%
gvd | 871 | 1.7%
refseq | 567 | 1.1%
phagesdb | 404 | 0.8%
igvd | 386 | 0.8%
Other values (4) | 476 | 1.0%

Most occurring characters

Value | Count | Frequency (%)
G | 42472 | 19.5%
V | 35854 | 16.5%
M | 26260 | 12.1%
D | 16839 | 7.7%
R | 14574 | 6.7%
I | 14393 | 6.6%
_ | 14007 | 6.4%
P | 13288 | 6.1%
O | 5915 | 2.7%
2 | 5915 | 2.7%
Other values (19) | 28096 | 12.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 217613 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
G | 42472 | 19.5%
V | 35854 | 16.5%
M | 26260 | 12.1%
D | 16839 | 7.7%
R | 14574 | 6.7%
I | 14393 | 6.6%
_ | 14007 | 6.4%
P | 13288 | 6.1%
O | 5915 | 2.7%
2 | 5915 | 2.7%
Other values (19) | 28096 | 12.9%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 217613 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
G | 42472 | 19.5%
V | 35854 | 16.5%
M | 26260 | 12.1%
D | 16839 | 7.7%
R | 14574 | 6.7%
I | 14393 | 6.6%
_ | 14007 | 6.4%
P | 13288 | 6.1%
O | 5915 | 2.7%
2 | 5915 | 2.7%
Other values (19) | 28096 | 12.9%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 217613 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
G | 42472 | 19.5%
V | 35854 | 16.5%
M | 26260 | 12.1%
D | 16839 | 7.7%
R | 14574 | 6.7%
I | 14393 | 6.6%
_ | 14007 | 6.4%
P | 13288 | 6.1%
O | 5915 | 2.7%
2 | 5915 | 2.7%
Other values (19) | 28096 | 12.9%

Function_Prediction_source
Categorical

High correlation  Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 27257
Missing (%): 54.5%
Memory size: 2.8 MiB

Value | Count
- | 12448
eggNOG-mapper | 8657
Iterative search | 1638

Length

Max length: 16
Median length: 1
Mean length: 6.6480675
Min length: 1

Characters and Unicode

Total characters: 151197
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: eggNOG-mapper
2nd row: eggNOG-mapper
3rd row: eggNOG-mapper
4th row: eggNOG-mapper
5th row: eggNOG-mapper

Common Values

Value | Count | Frequency (%)
- | 12448 | 24.9%
eggNOG-mapper | 8657 | 17.3%
Iterative search | 1638 | 3.3%
(Missing) | 27257 | 54.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(blank) | 12448 | 51.1%
eggnog-mapper | 8657 | 35.5%
iterative | 1638 | 6.7%
search | 1638 | 6.7%

Most occurring characters

Value | Count | Frequency (%)
e | 22228 | 14.7%
- | 21105 | 14.0%
g | 17314 | 11.5%
p | 17314 | 11.5%
a | 11933 | 7.9%
r | 11933 | 7.9%
G | 8657 | 5.7%
O | 8657 | 5.7%
N | 8657 | 5.7%
m | 8657 | 5.7%
Other values (8) | 14742 | 9.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 151197 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
e | 22228 | 14.7%
- | 21105 | 14.0%
g | 17314 | 11.5%
p | 17314 | 11.5%
a | 11933 | 7.9%
r | 11933 | 7.9%
G | 8657 | 5.7%
O | 8657 | 5.7%
N | 8657 | 5.7%
m | 8657 | 5.7%
Other values (8) | 14742 | 9.8%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 151197 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
e | 22228 | 14.7%
- | 21105 | 14.0%
g | 17314 | 11.5%
p | 17314 | 11.5%
a | 11933 | 7.9%
r | 11933 | 7.9%
G | 8657 | 5.7%
O | 8657 | 5.7%
N | 8657 | 5.7%
m | 8657 | 5.7%
Other values (8) | 14742 | 9.8%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 151197 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
e | 22228 | 14.7%
- | 21105 | 14.0%
g | 17314 | 11.5%
p | 17314 | 11.5%
a | 11933 | 7.9%
r | 11933 | 7.9%
G | 8657 | 5.7%
O | 8657 | 5.7%
N | 8657 | 5.7%
m | 8657 | 5.7%
Other values (8) | 14742 | 9.8%

Interactions

[Pairwise interaction plots: 121 panels, consistent with an 11 × 11 grid over the 11 numeric variables; images not reproduced here.]

Correlations

Variable | Aromaticity | Function_Prediction_source | Function_prediction_source | Helix_fraction | Instability_index | Isoelectric_point | Molecular_weight | Oxidized_coefficient | Phage_source | Protein_source | Reduced_coefficient | Sheet_fraction | Start | Stop | Strand | Turn_fraction
Aromaticity | 1.000 | 0.009 | 0.017 | 0.461 | -0.015 | -0.011 | 0.194 | 0.595 | 0.023 | 0.003 | 0.599 | -0.233 | 0.038 | 0.037 | 0.009 | -0.032
Function_Prediction_source | 0.009 | 1.000 | 0.000 | 0.043 | 0.024 | 0.037 | 0.040 | 0.008 | 0.329 | 1.000 | 0.006 | 0.050 | 0.066 | 0.063 | 0.019 | 0.059
Function_prediction_source | 0.017 | 0.000 | 1.000 | 0.037 | 0.011 | 0.025 | 0.018 | 0.000 | 0.822 | 1.000 | 0.000 | 0.007 | 0.093 | 0.092 | 0.051 | 0.031
Helix_fraction | 0.461 | 0.043 | 0.037 | 1.000 | -0.137 | -0.048 | 0.064 | 0.245 | 0.017 | 0.000 | 0.249 | -0.072 | 0.050 | 0.048 | 0.006 | -0.193
Instability_index | -0.015 | 0.024 | 0.011 | -0.137 | 1.000 | -0.038 | 0.172 | 0.081 | 0.004 | 0.000 | 0.076 | 0.162 | -0.002 | -0.003 | 0.008 | 0.025
Isoelectric_point | -0.011 | 0.037 | 0.025 | -0.048 | -0.038 | 1.000 | 0.058 | 0.031 | 0.023 | 0.011 | 0.032 | -0.272 | -0.002 | -0.002 | 0.000 | -0.009
Molecular_weight | 0.194 | 0.040 | 0.018 | 0.064 | 0.172 | 0.058 | 1.000 | 0.627 | 0.011 | 0.000 | 0.620 | 0.044 | -0.001 | -0.002 | 0.000 | 0.031
Oxidized_coefficient | 0.595 | 0.008 | 0.000 | 0.245 | 0.081 | 0.031 | 0.627 | 1.000 | 0.007 | 0.000 | 0.998 | -0.117 | 0.008 | 0.007 | 0.000 | 0.016
Phage_source | 0.023 | 0.329 | 0.822 | 0.017 | 0.004 | 0.023 | 0.011 | 0.007 | 1.000 | 1.000 | 0.007 | 0.023 | 0.072 | 0.072 | 0.063 | 0.019
Protein_source | 0.003 | 1.000 | 1.000 | 0.000 | 0.000 | 0.011 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.005 | 0.073 | 0.072 | 0.038 | 0.000
Reduced_coefficient | 0.599 | 0.006 | 0.000 | 0.249 | 0.076 | 0.032 | 0.620 | 0.998 | 0.007 | 0.000 | 1.000 | -0.115 | 0.008 | 0.006 | 0.000 | 0.015
Sheet_fraction | -0.233 | 0.050 | 0.007 | -0.072 | 0.162 | -0.272 | 0.044 | -0.117 | 0.023 | 0.005 | -0.115 | 1.000 | -0.023 | -0.026 | 0.009 | -0.321
Start | 0.038 | 0.066 | 0.093 | 0.050 | -0.002 | -0.002 | -0.001 | 0.008 | 0.072 | 0.073 | 0.008 | -0.023 | 1.000 | 0.999 | 0.000 | -0.012
Stop | 0.037 | 0.063 | 0.092 | 0.048 | -0.003 | -0.002 | -0.002 | 0.007 | 0.072 | 0.072 | 0.006 | -0.026 | 0.999 | 1.000 | 0.000 | -0.007
Strand | 0.009 | 0.019 | 0.051 | 0.006 | 0.008 | 0.000 | 0.000 | 0.000 | 0.063 | 0.038 | 0.000 | 0.009 | 0.000 | 0.000 | 1.000 | 0.000
Turn_fraction | -0.032 | 0.059 | 0.031 | -0.193 | 0.025 | -0.009 | 0.031 | 0.016 | 0.019 | 0.000 | 0.015 | -0.321 | -0.012 | -0.007 | 0.000 | 1.000
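
One way to read a matrix like this is to list the variable pairs whose absolute coefficient exceeds some cut-off. The sketch below does that for a handful of coefficients copied from the matrix. The 0.5 threshold is an assumption, not a value stated in the report, although it is consistent with which numeric pairs carry a "High correlation" badge here. The function name is hypothetical.

```python
# A few coefficients copied from the correlation matrix above.
correlations = {
    ("Reduced_coefficient", "Oxidized_coefficient"): 0.998,
    ("Start", "Stop"): 0.999,
    ("Molecular_weight", "Oxidized_coefficient"): 0.627,
    ("Aromaticity", "Reduced_coefficient"): 0.599,
    ("Aromaticity", "Helix_fraction"): 0.461,
    ("Sheet_fraction", "Turn_fraction"): -0.321,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

def high_correlation_pairs(corr, threshold=THRESHOLD):
    """Return pairs whose absolute coefficient meets the threshold, strongest first."""
    flagged = [(pair, r) for pair, r in corr.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)

for (a, b), r in high_correlation_pairs(correlations):
    print(f"{a} ~ {b}: {r:+.3f}")
```

Pairs such as Start/Stop or Reduced_coefficient/Oxidized_coefficient are close to 1, so they are likely to carry largely redundant information in downstream modelling.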

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
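
Nullity correlation can be sketched as an ordinary Pearson correlation computed on missing-value indicators (1 where a value is missing, 0 otherwise). The plain-Python version below is an illustration under that assumption, not the report's own implementation; with pandas, a comparable result is usually obtained with `df.isnull().corr()`. The function name and the toy columns are hypothetical.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return math.nan
    return cov / math.sqrt(var_a * var_b)

# Toy columns: exactly one of the two is filled in each row.
col_x = ["RefSeq", "RefSeq", None, None]
col_y = [None, None, "-", "eggNOG-mapper"]
result = nullity_correlation(col_x, col_y)
print(result)
```

A value near -1 means one column tends to be present exactly when the other is missing; a value near +1 means the two tend to be missing together.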

Sample

Index | Phage_ID | Protein_source | Function_prediction_source | Start | Stop | Strand | Protein_ID | Product | Protein_classification | Molecular_weight | Aromaticity | Instability_index | Isoelectric_point | Helix_fraction | Turn_fraction | Sheet_fraction | Reduced_coefficient | Oxidized_coefficient | Phage_source | Function_Prediction_source
0 | NC_001416.1 | RefSeq | RefSeq | 41950 | 42123 | + | NP_040636.1 | NinD protein | unsorted; | 6978.8276 | 0.105263 | 63.875439 | 6.183025 | 0.210526 | 0.228070 | 0.192982 | 19480 | 19730 | RefSeq | NaN
1 | NC_001629.1 | RefSeq | RefSeq | 16550 | 16738 | + | NP_042327.1 | DNA polymerase | replication; | 7795.9223 | 0.096774 | 50.417742 | 9.660732 | 0.241935 | 0.129032 | 0.354839 | 6990 | 6990 | RefSeq | NaN
2 | NC_001825.1 | RefSeq | RefSeq | 10787 | 11548 | + | NP_044830.1 | hypothetical protein | hypothetical; | 4674.0259 | 0.046512 | 46.130233 | 8.596811 | 0.186047 | 0.302326 | 0.139535 | 11000 | 11000 | RefSeq | NaN
3 | NC_001902.1 | RefSeq | RefSeq | 4440 | 4961 | + | NP_046963.1 | terminase small subunit | packaging; | 3523.6074 | 0.000000 | 30.130303 | 4.050028 | 0.181818 | 0.212121 | 0.484848 | 0 | 0 | RefSeq | NaN
4 | NC_001271.1 | RefSeq | RefSeq | 1758 | 1961 | + | NP_052068.1 | putative 0.6A protein | unsorted; | 7912.3506 | 0.194030 | 2.447910 | 10.502303 | 0.447761 | 0.134328 | 0.179104 | 11460 | 11460 | RefSeq | NaN
5 | NC_002486.1 | RefSeq | RefSeq | 15075 | 15275 | + | NP_061623.1 | DUF1514 family protein | unsorted; | 7869.2770 | 0.090909 | 21.142424 | 9.205199 | 0.393939 | 0.136364 | 0.287879 | 11460 | 11460 | RefSeq | NaN
6 | NC_002649.1 | RefSeq | RefSeq | 16590 | 17600 | + | NP_073699.1 | terminase | packaging; | 6530.5810 | 0.107143 | 12.292857 | 9.993905 | 0.357143 | 0.178571 | 0.232143 | 9970 | 9970 | RefSeq | NaN
7 | NC_003216.1 | RefSeq | RefSeq | 8498 | 8920 | + | NP_463475.1 | tail assembly chaperone | assembly;infection; | 8141.2391 | 0.085714 | 53.127143 | 5.457078 | 0.271429 | 0.071429 | 0.314286 | 7450 | 7575 | RefSeq | NaN
8 | NC_003216.1 | RefSeq | RefSeq | 21976 | 22425 | - | NP_463489.1 | anti-CRISPR protein AcrIIA1 | infection; | 1114.3356 | 0.000000 | -0.544444 | 8.497852 | 0.333333 | 0.111111 | 0.444444 | 0 | 0 | RefSeq | NaN
9 | NC_003298.1 | RefSeq | RefSeq | 34775 | 35044 | + | NP_523344.1 | terminase small subunit | packaging; | 2187.3159 | 0.105263 | 5.315789 | 4.050028 | 0.315789 | 0.157895 | 0.157895 | 2980 | 2980 | RefSeq | NaN

Index | Phage_ID | Protein_source | Function_prediction_source | Start | Stop | Strand | Protein_ID | Product | Protein_classification | Molecular_weight | Aromaticity | Instability_index | Isoelectric_point | Helix_fraction | Turn_fraction | Sheet_fraction | Reduced_coefficient | Oxidized_coefficient | Phage_source | Function_Prediction_source
49990 | biochar_4645 | prodigal | NaN | 9878 | 10351 | - | biochar_4645_15 | unknown | unsorted; | 1744.8678 | 0.000000 | 199.658824 | 11.999968 | 0.000000 | 0.470588 | 0.176471 | 0 | 0 | STV | -
49991 | biochar_4678 | prodigal | NaN | 8287 | 8457 | - | biochar_4678_16 | unknown | unsorted; | 6529.1972 | 0.125000 | 33.925179 | 4.176893 | 0.357143 | 0.160714 | 0.303571 | 20970 | 20970 | STV | -
49992 | biochar_4840 | prodigal | NaN | 7538 | 9232 | + | biochar_4840_12 | unknown | unsorted; | 428.4833 | 0.000000 | -13.725000 | 9.179992 | 0.000000 | 0.500000 | 0.000000 | 0 | 0 | STV | -
49993 | biochar_5076 | prodigal | NaN | 5067 | 7295 | - | biochar_5076_8 | translation initiation factor activity | regulation; | 4110.4325 | 0.023810 | 67.423810 | 4.050028 | 0.119048 | 0.476190 | 0.238095 | 0 | 0 | STV | Iterative search
49994 | biochar_5302 | prodigal | NaN | 5839 | 6039 | - | biochar_5302_8 | unknown | unsorted; | 7191.1912 | 0.030303 | 46.100000 | 9.915834 | 0.166667 | 0.242424 | 0.318182 | 1490 | 1490 | STV | -
49995 | biochar_5324 | prodigal | NaN | 3761 | 4024 | + | biochar_5324_7 | unknown | unsorted; | 1876.0784 | 0.058824 | 33.547059 | 5.444857 | 0.176471 | 0.294118 | 0.235294 | 0 | 125 | STV | -
49996 | biochar_5418 | prodigal | NaN | 7591 | 7872 | - | biochar_5418_17 | unknown | unsorted; | 2265.6500 | 0.000000 | 90.026087 | 8.747860 | 0.173913 | 0.347826 | 0.391304 | 0 | 0 | STV | -
49997 | biochar_5440 | prodigal | NaN | 9755 | 10921 | + | biochar_5440_19 | head-tail adaptor | assembly;infection; | 4361.8676 | 0.179487 | 28.192308 | 6.913122 | 0.307692 | 0.307692 | 0.230769 | 8480 | 8480 | STV | Iterative search
49998 | biochar_5583 | prodigal | NaN | 4937 | 5089 | + | biochar_5583_9 | unknown | unsorted; | 5802.6183 | 0.020000 | 44.792000 | 5.017599 | 0.340000 | 0.140000 | 0.320000 | 5500 | 5500 | STV | -
49999 | biochar_5611 | prodigal | NaN | 4289 | 6148 | - | biochar_5611_7 | actin binding | replication; | 6568.9755 | 0.016949 | 50.081356 | 4.944731 | 0.101695 | 0.101695 | 0.338983 | 0 | 0 | STV | eggNOG-mapper
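
In the sample rows, Function_prediction_source and Function_Prediction_source differ only in letter case, and each is NaN where the other is filled. This suggests, but does not prove, that they are two spellings of one field. Under that assumption, a minimal sketch of merging them into a single value could look like the following; the `coalesce` helper and the `function_source` key are hypothetical names, and `None` stands in for NaN.

```python
def coalesce(primary, fallback):
    """Return primary unless it is missing (None), otherwise fallback."""
    return primary if primary is not None else fallback

# Two rows abridged from the sample above; None stands in for NaN.
rows = [
    {"Protein_ID": "NP_040636.1",
     "Function_prediction_source": "RefSeq",
     "Function_Prediction_source": None},
    {"Protein_ID": "biochar_5611_7",
     "Function_prediction_source": None,
     "Function_Prediction_source": "eggNOG-mapper"},
]

for row in rows:
    # Hypothetical merged field combining the two differently cased columns.
    row["function_source"] = coalesce(
        row["Function_prediction_source"], row["Function_Prediction_source"]
    )
    print(row["Protein_ID"], row["function_source"])
```

Before merging real data it would be worth confirming that no row has both columns filled with conflicting values.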